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ABSTRACT 


M. 


U.  Clauser 
Provost 


The  large  random  access  memory  and  high  internal  speeds  of  present 
day  computers  can  be  used  to  increase  the  efficiency  of  large-scale  simu- 
lation experiments  by  estimating  simultaneously  several  quantiles  of  each 
of  several  statistics.   In  order  to  do  this  without  inordinately  increas- 
ing programming  complexity,  quantile  estimation  schemes  are  required  which 
are  simple  and  do  not  depend  on  special  features  of  the  distributions  of 
the  statistics  considered.   We  discuss  limitations,  when  the  probability 
level   a   is  very  high  or  very  low,  of  two  basic  methods  of  estimating 
quantiles.   One  method  is  the  direct  use  of  order  statistics;  the  other 
is  based  on  the  use  of  stochastic  approximation. 

Several  modifications  of  these  two  estimation  schemes  are  considered, 
In  particular  a  simple  and  computationally  efficient  transformation  of  the 
simulation  data  is  proposed  and  the  properties  (i.e.  bias  and  variance)  of 
quantile  estimates  based  on  this  scheme  are  discussed. 


Prepared  by: 


INTRODUCTION 

Consider  a  situation  in  which  we  have  a  collection  of  random 

variables     X      X    ,  „  .  .  ,  X      with  joint  distribution    F(x    ,  .,„     x    ),    and 
1        L  n  In 

several  statistics  (functions  of  these    n    random  variables),    say 

s(n)  =  g.  (X    , .  .  .     X   ),    T(n)  =  g9(X    , .  .  .  ,  X   ),    etc.      It  is  required  to 
11  n  c      1  n 

estimate  the  distributions    F   (s),    Fj(t),    ...    of  these  statistics  (or   some 

characteristics  of  the  distributions)  by  obtaining    m     samples 

x,     . ,    x        ,    „..,   x      .,     i  =  l,    ...,    m,    from    F(x    ,  „  „  .  ,  x   )    and  hence    m 
l,i6,i  n,i  1  n 

values  for  each  of  the  statistics     S(n),    T(n),    ....      Two  examples  of  this 

type  of  situation  are  as  follows: 

i)        In  testing  for  independence  in  a  time  series     {X.},    many  test 

statistics  have  been  proposed,,      These  are  functions  of  finite  sets  of  the 

{X.},    namely    X    ,  „  „  0  ,  X    ,    and  the  hypothesis  is  that    F(x    ,  ...,x    ) 
l  1  n  In 

n 

=  IT  ,    Fv  (x.)0      Typical  statistics  are  the  sample  serial  correlation  co- 
1=1      X.      i 

l 

efficients  with  various  delays  (lags),    i0  ea 

1     n-i 
—  X.  (x.   -  x)(x.    a  -  5E) 
/    \        n-^   i  =  l  v    i                i+l  .       .     ~ 

p  ^(n)    = J J^ 2 '  =     »     »*  ••  'n_1» 

n      ill   (Xi"^ 

where    x  =  (x  +.„.+x   ) /n,     statistics  based  on  the  finite  Fourier  trans- 
1  n 

form  of  the    x  's    which  test  essentially  that  the  spectrum  of    {X.}    is 

i  1 

flat,  and  several  non-parametric  test  statistics  such  as  those  based  on 
runs.  The  distributions  of  most  of  these  statistics  are  known  for  inde- 
pendent,   normally  distributed    X.'s,    but  not  when  the  assumption  of  a 


normal  distribution  is  removed.        In  testing  for  a  renewal  hypothesis 
in  series  of  events  (Cox  and  Lewis,    1966,   p.    164)    an  exponential  distri- 
bution for  the     X.'s    may  be  reasonable,,      The  null  distributions  of  the  test 

1 

statistics  are  then  unknown,    as  are  the  rate  of  convergence  to  the  limit- 
ing   (n  —  co)    distributions   (some  of  which  are  known). 

In  examining  and  tabulating  the  finite  sample  distributions,    it  may 
be  required  to  estimate  the  distributions  of  several  of  these  test  statistics 
for  many  different  values  of    n„ 

ii)        There  is  much  interest  in  analyzing  very  complex  queueing 
and  congestion  systems,    particularly  those  that  arise  in  computing  and 
communication  contexts.      Here  one  might  be  interested  in  estimating  by 
simulation  the  distributions  of  the  waiting  times  at  several  points  in  the 
system  at  several  different  times  during  its  evolution. 

In  looking  at  the  problem  of  estimating  these  distributions  from 
m  replications  of  the  statistics,  several  general  problems  arise  which 
need  to  be  considered. 

First,    it  is  neither  practical  nor  desirable  to  save  all  of  the  in- 
formation generated  on  a  statistic  by  the  simulation     in  the  form  of  the 
empirical  distribution  function  or,    equivalently ,    in  the  form  of  the  com- 
plete set  of    m    order  statistics.      Some  compact  characterization  of  the 
distribution  is  required.     In  situations  such  as  that  of  the  first  example 
given  above,   out  of  which  this  present  study  in  fact  arose,   the  character- 
istics of  the  distribution  function  chosen  were  the  first  four  moments 


and  sixteen  quantiles  of  the  distribution.     (A  quantile    x       of  a  distribution 


a 


F(x)     is  defined  as  the  solution  of  the  equation 

or  =  F(  x    ) ,  0.  0  <  a  <  1 .  0  , 


a 


and  we  shall  be  assuming  throughout  this  paper  that    x       is   unique). 
The  probability  levels  chosen  for  the  quantiles  were    a  =  0.  001  ,    0.  002,   0.  005, 
0.010,  0.0?0,    0.025,    0.050,    0.100,   and   a  «  0.900,    0.950,    0.975,    0.980, 
0.990,    0.995,    0.998,    0.999.      The  choice  of  some  of  these    a's    is  based 
on  the  levels  customarily  used  in  testing  statistical  hypotheses;  the  more 
extreme  values  have  been  chosen  rather  arbitrarily  to  characterize  the 
extreme  tails  of  the  distributions.      In  many  queueing  situations  it  is  these 
extreme  values,    rather  than  moments,   which  are  of  interest. 

Another  possible  characteristic  is  the  probability  of  the  statistic 
being  less  than  a  given  value.      These  percentiles  are  clearly  important 
in  studies  of  the  power  of  test  statistics  against  non-null  hypotheses. 
Their  estimation  is  fairly  straightforward  and  will  not  be  considered  here, 

A  second  point  concerns  the  measurement  of  statistical  efficiency 
by  either  the  variance  or  the  mean  square  error  of  the  estimator.      The 
mean   square  error  is  the  variance  of  the  estimator  plus  its  bias  squared. 
In  large  scale  applications  of  simulations,    as  treated  in  this  study,    it  is 
important  to  obtain  internal  assessments  of  the  variability  of  the  estima- 
tion procedures  by,    for  instance,    estimating  a  quantile  as  the  average  of 

r     estimates  from  samples  of  size     m    .    where     r  m     =  m.      The  sample 

r  r 


i  fz 

standard  deviation  calculated  from  the     r     estimates,    divided  by   (r  )   '     , 
then  estimates  the  sampling  standard  deviation  of  the  quantile  estimator  (see 
Mosteller  and  Tukey,    1968,   for  more  details).      In  order  to  assess  the 
internal  variability  of  the  estimation  in  this  way  the  bias  of  the  estimator 
must  be  small  compared  to  the  standard  deviation  of  the  estimator. 
Otherwise     we  obtain,    from  one  point  of  view,    a  significant  bias  component 
in  the  mean  square  error,    or  from  another  point  of  view,    an  estimate 
with  very  small  sampling  variance  of  the  wrong  quantity,    i.  e.  ,    true 
quantile  plus  bias. 

Bias  thus  becomes  a  very  important  factor  in  assessing  the 
quantile  estimates  discussed  in  this  paper. 

A  third  consideration  is  that  in  some  cases  one  can  find  particular 
properties  of  a  statistic  whose  distribution  is  to  be  estimated  that  allow 
one  to  obtain  estimates  that  are  more  "efficient"  than  those  obtained  by 
straight  synthetic  sampling.      By  "efficiency"  here  we  mean  statistical 
efficiency,   or  the  relative  variances  or  mean  square  errors  of  different 
estimating  procedures.     However  there  are  other  less  tangible  costs  in- 
volved in  simulation.      One  is  the  time  involved  in  deriving  a  particular 
procedure  for  a  given  statistic,    another  the  time  involved  in  programming 
and  debugging  such  a  procedure  and  the  delay  in  obtaining  results,,      Still 
another  cost  is  the  actual  computing  time  involved  though  this  is  seldom 
mentioned  in  the  statistical  literature.     (This  latter  point  will  become 
clearer  later  in  the  paper.  )     These  less  tangible  costs  are  an  important 


component  of  overall  computational  efficiency.      The  point  of  view  taken 
in  this  paper  is  that  the  rapidly  accelerating  availability  of  large  memory, 
high  internal  speed  computers  makes  it  usually  more  "efficient",    in  the 
general  computational  sense,    to  forego  using  special  techniques  for  each 
particular  statistic  and  to  use  very  simple  straightforward  simulation 
techniques^      Thus  a  criterion  for  the  quantile  estimation  techniques  dis- 
cussed here  is  programming  and  computing  simplicity. 

We  do  not  mean  to  imply  by  this  that  all  sophisticated  statistical 
and  Monte  Carlo  techniques  are  not  useful  or  applicable.     Global  techniques 
such  as  jackknifed  estimates  (Quenouille,    1956)  or  the  use  of  variance 
reduction  techniques  such  as  control  variables  (Gaver,    1969)  can  be  used 
with  the  quantile  estimation  methods  discussed  in  this  paper.      The  jack- 
knife  technique  will  be  discussed  in  the  next  section  and  the  use  of  the 
quantile  estimation  techniques  in  the  context  of  sophisticated  Monte  Carlo 
will  be  discussed  elsewhere. 

Finally,    it  is  perhaps  worthwhile  to  give  some  idea  of  the  numbers 
envisioned  in  connection  with  the  estimation  procedures.      Clearly  no 
scheme  involving  only  raw   simulation  will  work  satisfactorily  in  estimat- 
ing a     0.  001        quantile    (x      nni)    unless  the  number    m    of  replications 
involved  is  substantially  greater  than    1000.         Typically  in  the  COMPSTAT 
program  (Goodman  and  Lewis,    1972),    for  which  these  techniques  were 
developed,    replications  of  100,000  or  more  are  common.      These  are  not 
excessive  on  the  IBM  360/91   on  which  the  runs  were  made.      In  addition, 


the  large  core  memory  of  this  machine  enabled  us  to  run  sampling  experi- 
ments on  up  to  31    statistics  simultaneously. 

The  following  discussion  uses  as  examples,    for  the  most  part, 
the  extreme     0o  001     and     0.  999     quantiles.      The  techniques  are  still 
useful,    though  not  as  much  so,    for  the  inner  quantiles0      Some  discussion 
of  the  simultaneous  estimation  of,    for  example,    all  sixteen  quantiles 
listed  above  is  given.      The  dependence  of  the  utility  and  efficiency  of  the 
various  quantile  estimation  schemes  on  the  particular  quantile  or  set  of 
quantiles  plus,    for  example,   variations  in  the  complexity  of  computation 

of  the  series     X    ,  .  .  .  ,  X       and  the  statistics     S(n),    T(n),...      indifferent 
1  n 

problems  make  it  difficult  to  be  dogmatic  about  the  relative  utility  of 
various  quantile  estimation  schemes.      In  addition,    most  of  the  results 
required  are  finite  sample  results.      These  are  difficult  to  obtain  analytic- 
ally and  expensive,    as  yet,    to  obtain  computationally. 

II.      QUANTILE  ESTIMATION 
a)   Overall  Considerations 

Two  general  methods  of  quantile  estimation  are  considered,    one 
based  on  the  order  statistics  of  the  sample,    the  other  based  on  stochastic 
approximation  (Robbins-Monro)  techniques    (see,    e.  g.  ,    Robbins  and 
Monro  [1951],    Hodges  and  Lehman  [1956],    Cochran  and  Davis  [1965])„ 

For  simplicity  we  drop  the  notation  which  indicates  de- 
pendence on    n    and  write     S(n)    as     S    and  write  its  distribution 
function  simply  as     F(s).      A  collection  of    m    independent  random 


variables  with  distribution    F(s)    will  be  denoted  by    S    ,..«,S,..#.S 

1  1  rn 

and  the  corresponding  order  statistics  by    S.       <  S._.  <  .  „  ,    <  S,   .  < .  „  .  <  S 

(1)         (2)  (i)  (m) 

The  order  statistic  estimator  of  the     a-quantile     s    ,   where 
a  -  F(s    ),    is 


a 


s       *=    s/r         ii»  (2.D 

a  ([arm]) 

where    [a  m]    denotes  the  integral  part  of    a  m.      Thus  for    a  =  0,  999    and 

m  =  1  0,  000,      s~    0rt^    *  S  lnnnnx  0 
0o999  (9990)° 

The   stochastic  approximation  estimate     s    (m)  is  defined  to  be  the 

a 

m  th    value  in  the  sequence  defined  by 


i  (i)  =  s  (i-n-  f 

or  a  i 


1    -   sgn 
2 


; al        (i  =  1,2,.  .  .  ,m), 


(2.2) 


where     sgn  x  =  1     if    x  >  0    and    -1     if    x  <  0,   and     s    ( 0)    is  an  arbitrary 

en 

initial  value.       If  the  constant    C    is  chosen  to  be    l/f(  s    ),   where    f(s    )    is 

a  a 

the  density  associated  with    F(s)     evaluated  at  the  quantile     s    ,    then  the 
asymptotic  variance    (m— •-co)     of     s   (m)     is  minimized,,      In  fact, 


-A 


E  {s   (m)  }~   s  (2.3) 

a  a 


and 


var     {^(m)}^ *^  ~a)         .  (2.4) 

m  f   (s    ) 
a 

Remarkably,   the  estimate     s      has  the  same  asymptotically  normal 

distribution  as  does     s   (m)„      Results  on  rates  of  convergence  are  known 

a 

for     s       but  not  for     s   (m).      This  will  be  discussed  later,    but  significant 
a  a 


comparisons  can  be  made  on  the  basis  of  the  asymptotic  results  and  on 

known  computational  results. 

Order  Statistics,    s 
'-—a 

computation  time  (to  order  the    m    realizations  of    S)     is  pro- 
portional to    m  ln(m); 

computer  memory  required  for  the  sorting  process  is  proportional 
to    m.     (Actually,    to    a  m    if    a  <  —  ,     (1  -a)  m    if    a  >  —■  ); 

-  no  initial  values  or  knowledge  of    F(s)    is  required; 
the  rate  of  convergence  of  the  estimate  is  known, 

A  A 

Stochastic  Approximation,      s     =  s   (m) 
i. a Qri^ — 

computation  time  (binary  comparison)  is  proportional  to  m; 

computer  memory  required  is  proportional  to  2} 

initial  values     s   (0)    and  values  for    f ( s    )    are  needed,   presumably 
a  a 

previous  estimates  or  guesses  based  on  prior  knowledge; 

-  no  reliable  results  known  on  the  rate  of  convergence  of    s     ,   or 

a 

even  of    E(s    ); 

it  is  not  necessary  to  know     S,     exactly,   only  that  it  is  greater 

A 

than  or  less  than    s    (i).      This  can  be  very  advantageous    if  com- 
putation of    S    is  time-consuming. 

In  summary,      s       has  very  definite  advantages  over     s      in  terms 
a  ot 

of  computational  considerations.     One  might  say  that  the  asymptotic 

relative  efficiency  of     s        compared  to    s    ,    in  terms  of  real  time  and  not 

a  a 

A 

sample  size    m,    is  zero-      However,    initial  values  are  needed  for     s        so 
r  a 


that  the  asymptotic  results  really  beg  the  question.  Further  differences 
appear  in  terms  of  finite  sample  properties  of  the  estimators,  and  these 
are  discussed  next. 

III.      QUANTILE  ESTIMATION      -  Finite  Sample  Considerations  for    1" 

a 

It  is  well  known  (David,    1971  )   that  for  the  ordered  sample 
S(l)<S(2)<oao<S(i)<0°°    <S(m) 

Fs      (s)  =prob  {S(i)  <  s  }  -  J" .    ^)[F(s)]k[l-F(s)]m-k,  (3.1) 


E{-s    }    =  E{S  }    =     s     +  o(^),  (3.3) 

a  {[a  mj )  a  \  m/ 


f~      (s)  =  F»       (■)-(.  [F(s)]1'1   [i-F(s)]m"1  f(s),  (3.2) 

S(i)  ID 


a{]'a)      +0/-L-\  (3o4) 

\  m     / 


a  ([a  m])  ,2 

L  J  mf  (s    ) 

a 


Asymptotic  expansions  for    E  {s     }    are  known  (David  and  Johnson 
[1954]  and  Clark  and  Williams  [1958] ),    the  first  terms  in  the  expansion 

being 

ff(s    )  ,,       x         2[(f?(s    ))2"f(s    )f"(s    )]  ,    ..       Wl    ,    v 

it/       \  V    a'  g(l-or)  L  a"  <*  g    J  2g(l-a)  (l-2q) 

E{V"  %"         3,  '     m+2  ,    .5.      .  '     (m+2)  (m+3)  °"> 

2f  (s    )  6  f  (s   ) 

(3.5) 

where  derivatives  are  denoted  by  primes  and  powers  by  arable  numerical  exponents. 
No  precise  conditions  for  these  asymptotic  expansions  seem  to  be  known,   and 
they  must  be  used  with  care.      For  example,   if    S  is  uniformly  distributed  be- 
tween   0    and    1,    s     «a   and,   using  (3.  2),     E  {S      }  =  i/(m+l )  =<i/m)-i/[(m)(m+ 1)]. 

a  (i) 


10 


Thus  if    amis  the   integer    i,     we  have 


E  {  T     }    =    a  - 


a 


a  J  (m+1)  ' 

but  the  second  term  of  the  expansion    (3.  5)    is  zero.      Similarly,    in  the 
extreme  but  computationally  useful  case    S  =       ■■ «  ,   where    U    is  uni- 
form,  we  have 

a  distribution  with  an  infinite  mean,      Surprisingly,    however,      s"      can  be 

shown,    using  (3„  2)  to  be  an  unbiased  estimator  of    s   „      In  this  case  the 

expansion  (3„5  )  does  not  even  converge. 

An  important  consequence  of  (3.  3)  is  that  the  jackknife  technique 

(Quenouille    [1956],    Mosteller  and  Tukey  [1966])  for  eliminating  an    0(— ) 

term  in  the  bias  can  be  applied  to  the  order  statistic  estimate    "s    .      By 

a 

way  of  illustration  consider  the  technique  being  applied  with  just  a  simple 

splitting  of  the  data.      Thus,    assume    m    is  even  and    s"  (1)    is  the  order 

statistic  estimator  from  the  first    m/2    S.'s,     s   (2)    the  estimator  from 

l  a 

the  second    m/2    S.'s.      The  "typical  values"  are  defined  to  be 


s   (1)    =    2¥    -7(1),  s(2)    -    2s     -s(2),  (3.7) 

a  a         a  a  a         a 

and  the  jackknifed  estimate  is 

f  (1)+  f  (2)  s"  (1)  T  (2) 

s       „  . ,  2s .  (3.  8) 

a  2  a  2  2 


11 


The    0( )    term  in    E[s    1    is  zero.      There  is  no  appreciable 

m  or  rr^ 

computing  cost  involved  in  obtaining  the  jackknifed  estimate.      The  two 

halves  of  the  sample  can  be  sorted  in  place  to  obtain    "s    (1)    and    "s    (2), 

or  a 

and  then  these  sorted  halves  are  merged  to  obtain  the  complete,    sorted 

sample     S        <ot,<S         «      This  in  fact  is  just  the  usual  binary  sorting 
(1)  (m) 

procedure  which  results  in      m  In  (m)    sorting  time. 

The  main  drawback  to  the  jackknife  procedure  is  that  it  may  in- 
crease the  variance  of  the  estimator,    and  this  variance  is  difficult  to 
obtain  analytically.      We  have 

var  { s"   }  =  5  var  {"s    }  -  2  cov  fs~     "s    (1)  }  <  5  var  {"s     }  ,„    „. 

a  a  a      a  a     ,  (3.  9) 

The  covariance  term  involves  the  covariance  between  the     [a(m/2)]  order 
statistic  in  half  the  sample,    and  the     [orm]    order  statistic  in  the  whole 
sample.     If  in  fact    "s     (1)    and    ~s    (2)    are  approximately  uncorrelated,   then 
there  is  no  increase  in  variance.     Even  if  the  variance  is  inflated  enough 
to  make  the  mean  square  error  of  the  estimates  approximately  equal, 
there  is  a  gain  in  that  the  smaller  bias  of  the  jackknifed  estimator  allows 
for  sectioning  the  complete  sample  of  size    m    into     r     smaller  sections 
of  size      m      .      This  gives  a  more  precise  empirical  variance  estimate 
and  smaller  computation  time,    the  latter  following  since  shorter  sections 
are  sorted. 

Unfortunately,   there  is  some  evidence    (Miller,    1964)     that  jack- 
knifing  is  a  poor  technique  to  use  with  estimators  involving  extreme  order 


12 


statistics.     We  give  now  an  illustrative  example. 
Example  -   The  exponential  distribution 

Consider  the  estimation  of  the     0.  999  quantile     s  for  a  unit 

exponential  distribution.      For  simplicity    we    assume  that  the  sample 
size    m    is  such  that     0.  999  m    is  an  integer    k.      We  have 

y    -    F(s)   -     1  -  e'S  , 

s    =    F'^y)    -    -  ln(l-y),  (3.10) 

s       =    -  In  (1-a). 
a 

By  direct  methods  (Cox  and  Lewis,   1966)      one  gets 

12m    |__    (1-ar)  Vm    / 

var  {s    }    =     ■    " L l—  -  l]+  o\AA  .  (3.  12) 

m(l-a)  2m2      ^_a)?         J  ^3; 

These  results  are  used  to  give  Table  1,      The  ratio  (column  8)  of 
the  standard  deviation    (  <r,   column  7)  of  the  estimate  to  the  bias  of  the 
estimate  (column  3)  indicates  roughly  how  feasible  it  is  to  use  averages  of 
estimates     s"  from  samples  of  size    m    to  estimate    T  more  pre- 

cisely,  along  with  an  estimate  of  the  sampling  variance  of  that  estimate. 
Thus  36  samples  of  size    m  ■  10,000    produces  an  estimate  with  a  standard 
deviation  approximately  equal  to  the  absolute  value  of  the  bias,   clearly  an 
undesirable  situation.     Moreover  the  bias  is     -0,051,    so  that  this  estimate 
gives  us  accuracy  of  at  best  two  decimal  places.      This  would  not  be  acceptable 
in  many  cases. 
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The  leading  term  of  the  bias  of  the  jackknifed  estimate    (3.  8) 
is  shown  in  Table  2  (column  2)    as  a  function  of    m.      Column  3 
shows  the  ratio  of  the  standard  deviation  of  the  unjackknifed  esti- 
mator to  this  bias.      Clearly,    sectioning  and  averaging  is  much  more 
feasible.      However,   as  indiated  in  column  4   an  increase  by  only  a 
factor  of    1.  03    in  the  variance  of    T  makes  the  mean  square 

error  of  this  estimate  as  large  as  that  of    s  at    m    «  10,000. 

Clearly,    jackknifing  is  of  greater  utility  in  the  range  of  m    from  about 
1,  000    to    5,  000,   if  the  variance  does  not  blow  up  and  moreover,   it 
is  desirable  to  use  sections  as  short  as  this  if  possible,   to  reduce 
computation  time. 

Estimates  of  the  variance  of     TT  obtained  by  synthetic 

0.999 

sampling  are  given  in  column  5  of    Table  2.      The  quantities  in  paren- 
theses are  estimates   (63  degrees  of  freedom)  of  the  standard  deviation 
of  these  variance  estimates.      There  is  enough  increase  in  the  vari- 
ance over  the  unjackknifed  situation  to  characterize  the  gain 
from  using  the  jackknife  as  being  marginal  rather  than  categori- 
cal with  this  type  of  quantile  estimate.      For  the  exponential  case, 
however,   detailed  calculations  on  computation  times,   bias,    etc.  , 
show  that  in  large  simulations  there  would  be  an  advantage  in 
using    T  with  sections  of  length  approximately    m  -  5,  000. 
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An  alternative  scheme  is  to  use  as  a  "typical  value"  an 

order  statistic  estimate,     s*  from  a  small  independent  sample 

of  size    m/£.      Then  the  estimate    (s*   -  i  T  )/(l-i)    has  a  mean 

a  a 

value  free  of  the     1/m    term  and  a  variance  decreasing  to 

var   {"s    }     as    i     increases.     For    I  =  10    the  variance  is 
a 

1.  36  var     {s    };      for    i  =  20    it  is     1.  13  var     {T   },    an  in- 
a  a 

crease  smaller  than  that  found  for    T     qQQ'     However,   this 
technique  is  difficult  to  use  for  an    m    much  less  than  10,  000 
in  estimating  a    0.  001     quantile. 

IV.     QUANTILE  ESTIMATION  -  Finite  Sample  Considerations  for    s 

a 

Experience  shows  that  for  extreme  quantiles,   convergence 
of      s   (m)    is  so  slow  as  to  be  unacceptable.      Though  problems 
might  be  anticipated  in  the  tails  of  extremely  skew  distributions, 
e.  g.  ,   the  distribution     F(s)  «  s/(l+s)     discussed  above,   they 
occur  elsewhere  too,    as  can  be  seen  roughly  in  the  following 
example. 
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Example  -    The  unit  exponential  distribution 

Consider  again  the  unit  exponential  and  the    a  -  0.  001     quantile. 

This  is 

soaooi   =   '  ln{l  "  Oo001)  ~0-001' 

We  have 

r/  x  -0.001       .    , 

f(so.ooi}  ~e  "J1-* 

Assume  the  initial  estimate  is     s      nrn{0)  =  0.001,   and  that    S      is  less 
than    0.  001.      This  has  probability     0.  001.      Then 

sQ    001(D*  -1  *  0.999   *     -1. 

Clearly  successive    S.'s    are  greater  than  the  estimates  until  the  esti- 
mates get  back  to  zero.     After    i     steps  the  estimate  will  have  moved 

1        / 
0.001  x  1  x  .ZL   1/i    in  the  positive  direction  and  since 

*       1  1 

.Z      -  ~  0.5772  +  in(i)  +   —-    , 

1    —  11  Lt  f. 

the  return  to  the  origin  takes  about    i  =  10         steps.      By  following  through 
this  example  it  can  be  seen  that  the  estimate  is  almost  sure  to  become 
negative  and  take  an  enormously  long  time  to  return  to  zero. 

The  jackknife  technique  is  not  suitable  as  a  means  of  overcoming 
this  difficulty,    partly  because  the  order  of  the  leading  term  in  the 
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asymptotic  expansion  for  the  bias  is  not  known,    but  also  because  if  the 
correct  jackknifing  technique  was  applied  (possibly,   for  a  m  term)  the 

example  makes  it  clear  that  the  estimator  would  still  be  unsuitable. 

Two  other  techniques  suggest  themselves,     Kesten  [1958]    seems 
to  be  one  of  the  few  authors  to  have  noted  the  problem  of  long  runs  occur- 
ring in  the  stochastic  approximation  scheme.     He  suggested  higher-order 
memory  schemes,      for  example,  not  increasing  the  divisor  of    C    in    (2.2) 
by  one  if  all    p     previous  differences  of  sample  values  and  estimators 
were  of  the  same  sign.     Although  this  would  clearly  help  in  the  example, 
it  violates  the  need  for  simplicity.      In  fact,    the  divisors  in  the  estimators 
of  different  quantiles  for  different  statistics  become  different,    and  this 
creates  very  complex  and  time  consuming  programming  procedures. 

Another  technique  is  to  bound  the  estimator     s    (i).      Thus  requir- 
es 

ing    s   (i)  >  0    in  the  example  would  have  obviated  the  problem,    but  uses 

specific  information  about  the  statistics.      Empirically  derived  bounds 

can  be  obtained  at  the  same  time  that  initial  values     s   (0)    and    f(s    ) 

a  a 

are  obtained.     For  example  a  small  pilot  run  using  order  statistics 
will  give  these  initial  estimates  and  bounds  for  outer  quantiles 
(.001     and    .999)    from  the  known  properties  of  the  order  statistics. 
Inner  quantiles  are  bounded  by  outer  quantiles.      Thus  it  has  been  found 
empirically  that  if     A  =  |  bound  -  s"  (0)  |,  then  when    C/A  >  100,   the 
stochastic  convergence  scheme  works  reasonably  well.      It  is,    however, 
ponderous  when  compared  to  the  quantile  estimation  scheme  discussed 
in  the  following  section. 
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V,       THE  MAXIMUM  (OR  MINIMUM)   TRANSFORMATION 

It  is  natural  to  look  for  a  computationally  simple  transformation 

of  the  data  to  overcome  some  of  the  problems  of  estimating  extreme 

quantiles,    and  the  following  transformation  appears  to  solve  most  of  the 

computational  and  statistical  problems. 

Assume,    e„  g.  ,   that    a  >  1/2,   and  consider  the  first    v     S's,   i.  e,  , 

S      ,,,  ,S  .      Then  the  distribution    F~(  s)    of  the  maximum  of  the    v    S's    is 

F(S)=prob{(m?XvS.)<s}    =     {F(s)}V.  (5.1) 

Note  that 

F(s    )  =  FV(s    )  =  <*V=  a'  (5.2) 

a  a 

and 

Sa    =  V    »  (5.3) 

so  that  the    a-quantile  of    F(s)     is  the  same  as  the    a    -  a       quantiles  of 
F(s).     If  we  assume  that    v    divides    m,     and    m1  =  m/v,   taking  maxima 
of  successive  groups  of    v    S's    gives  a  reduced  sample  of  size    m1,    i»  e0  , 


Sl'    S2'    '•••    Sm' 


Thus  we  have  a  reduced  sample  and  have  transformed  the  problem  from 

estimating  an  extreme  quantile  (for  level   a)   to  estimating  a  more  reasonable 

v 
quantile  (for  level    a   ).     For  example,    one  might  take   v   large  enough  to  re- 

v 
duce  the  problem  to  estimating  a  median.      Since    a     =  1/2    in  this  case, 

v=  (In  l/2)/(ln  a)„      The  consequent  values  of    v    are  shown  for  eight    ar's 

in  Table  3. 
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Table  3 


v 
Size  of    v    for    a     =  1/2 


a 


0.900         0.950         0.975         0.980         0.990       0.995         0.998         0.999 


6.6  13.5  27.4  34.3  69.0       138.3         346.2         692.8 

Some  general  points  about  the  use  of  the  transformation  in  quantile 
estimation  follow, 

(1)  The  transformation  of  the  original  sample     {S.  }    into  the  reduced 

sample     {S.  }    is  very  efficient  computationally  since  only    v 

binary  comparisons  and  one  memory  cell  are  needed  to  obtain 

S       from    S        .  .  ,  S    . 
1  1  v 

(2)  The  transformation  uses  no  special  information  about  the 
properties  of    S,    e.  g.  ,    S  >  0. 

(3)  Although  transforming  to  the  median  is  not  necessarily  optimal, 
it  is  known  that  stochastic  approximation  estimates  of  the 
median  work  well  (Cochran    and  Davis     1965), 

(4)  If    a  <  1/2  the  minimum    S"    of  the    v    S's    is  used  instead. 

VI.     THE  MAXIMUM  TRANSFORMATION:     ASYMPTOTIC  VARIANCE 

Using  (2.  4)  we  now  compare  the  asymptotic  variance  of  a  quantile 

estimate  (order  statistic  or  stochastic  approximation)  s   g    from  the  re- 
ar 

duced  sample  with  that  of  the  quantile  estimate  s      from  the  original  sample, 

a 

We  have,    from  (5.  1), 

Ks)-vf(s)    {F(s)}V_1 
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and,    therefore,    using  (5.  3) 


Ksat)  =  vi(sa,)   {F(sa,)} 


v-1 


v-1 


=  v  f(s    )   {F(s    )  } 


f(s    ,)  =  v  f(s    )aV-1     =    v  f(s    )—  . 


Using  these  expressions  in  (2,  4)  we  have 

v  v 

var {r,i *   a  (1-a  >  - 

v  a' 


a  (1-a) 


v 
(1  -  a) 


m  f  (s     )  v  ( 1-a)  a 

a 


(0.  5  <  a  <  1) 


(6,1) 


(6.2) 


=  var   {s    }  •    g(or;v)  . 
a 


(6.3) 


It  can  be  shown  that     g(a;v)    increases  from     1     to  infinity  as    v 
increases  from  1     to    infinity.      Moreover,    for  the  median  transformation, 
vst?  (In  l/2)/\ln  a),   we  have,    as     a    approaches     1 

The  function    g(or;v)    at  the  value  of    v    generating  the  median  transforma- 
tion varies  little  with    a,   as  is  shown  in  Table  4. 

Table  4 


Variation  of    g(or;v)    with    a    at  median  transformation 


a 


g(«»v) 


0.9000 


1.4020 


0.  9990 


693 


1.  4426 


0.  9999 


6931 


1.  4426 
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Thus  the  statistical  efficiency  of  the  estimate  is  decreased  by 
approximately  1/(1.443),    but  the  speed  of  the  maximum  transformation,  com- 
pared to  the  computations  involved  in,  for  example,  the  stochastic  approximation 

estimate  would  probably  make  their  computational  efficiency  about  equal. 
By  computational  efficiency  we  mean  the  relative  computing  times  required 
to  achieve  the  same  variance.      Both  estimates  (reduced  and  non-reduced 
samples)  are  asymptotically  unbiased. 

The  variation  of      g{a;v)    with    v    for     a  =  0.999  is  shown  in  Table  5. 

Table  5 


Variation  of  g(  a;v)    with    v    at    a  =  0.999 

10  50  100  200  500  700  1000 


a1 


g(0.999;v) 


0,990  0.951  0.905       0.819  0.606       0.496       0.368 

1.005  1.025  1.051        1.107  1.297        1.448        1.718 


The  choice  of    v    in  particular  situations  depends  on  computational 
considerations,   particularly  the  meshing  of  estimates  for  various    or's 
for  a  given  statistic,   and  although  no  global  results  can  be  given,  we  discuss 
these  considerations  in  the  next  two  sections. 

VII.     THE  MAXIMUM  TRANSFORMATION:     COMPUTATIONAL  CONSIDERATIONS 
(i  )     Order  statistics 

The  use  of  the  maximum  transformation  with  order  statistic 
quantile  estimates  gives  very  little  gain.      Bias  is  not  an  extreme  considera- 
tion here,   and  computations  for  several  examples  with  the  asymptotic  ex- 
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pansion  (3.5)  show  that  there  is  very  little  change  in  small  sample  bias 
from  one  estimate  to  the  other.      Memory  size  is  not  much  affected.     For 
or=  0,999  we  require     0.  001  m    memory  cells  with  the  original  sample; 

1  1  TY1 

using    v  =  693    to  give    a%  &  1/2    and  a  reduced  sample,  we  need     —  m'  =  —  • 

2  2      693 

memory  cells.      The  slight  advantage  is  lost  when  it  is  noticed  (6.  2)  that  a 
larger  sample  size,     m,     is  required  to  achieve  equivalent  variance. 

If  the  eight  quantiles  with    als     given  in  Table  3  are  required,    the 
unreduced  sample  ordering  (done  at  one  time  for  all  eight  quantiles)  re- 
quires    0.  1  m    memory  cells.      The  eight  quantiles  require  eight  separate 

j         j  i  -4-u  •  4.       1       m        \       m  X     m 

reduced  samples  with  memory  requirements    —  77pr  ,   —         _   , .  .  .  ,  —   -~-   , 

or  a  total  roughly  equivalent  to    0.  1  m . 

The  time  gain  from  sorting  smaller  samples  is  again  marginal. 

(ii)     Stochastic  approximation 

The  greatest  gain  in  using  the  maximum  (or  minimum)  scheme 
comes  from  the  reduction  of  bias  with  the  stochastic  approximation  esti- 
mator (2.  2).      There  are  other  gains;  the  maximum  operation  is  faster  than  the 
the  computation  (2.  2),    and  changing    v    and    ax    is  very  simple,,     One  could 
do  the  eight  quantiles  of  Table  3  from  eight  reduced  samples  obtained  from 
v's    of    7,    14,    28,    35,   70,    140,    350,    700.      These  values  are  close  to  the 
v's     given  in  Table  3  and  all  divide  7  00.      Thus  it  is  easy  to  compute  the 
eight  reduced  samples  simultaneously  in  nested  loops. 

Again  only  a  fixed  number  of  memory  locations  is  required. 
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A  possible  scheme  for  selecting  initial  values  for  the  stochastic 

approximation  estimator  is  the  following.      Let    S'  ,  .  .  .  ,    S       t    be  the 

reduced  sample.      Take  the  first  three  values,     S'      S*  ,   S      and  order 

them  as    Sj, . ,    Sf_  Vl    S.'    ..     If    v    is  such  as  to  make    a' ^  1/2,   use    S 

(1)        (2)        (3)  \l) 

as    ~s    t  (0),   the  initial  value,    and  since    S1.     and    S  estimate  the 

a  \ 1)  \i) 

13  — 

—    and    —    quantiles  of    F(s),    estimate    f ( s    ,)    as: 

4  4  a 


1 


4S(3)-S'(2) 


2  1 4  (sUrsU>) 

(This  is  the  simplest  of  many  density  estimates.  ) 
The  stochastic  approximation  estimate 


t       ~ 


1-sgn  {S.  -  s    ,(i-l)} 


s    i  (i)  «  s^i(i-l)     "  T 
a  a  l 


uses     S       as  the     S.  ,    i=l,    in  the  equation. 


a 


•-  a 


(7.1) 


=  1,2,...  ,m,    {7.  2) 


The  value  of    C    is  not  critical  and  this    estimate  should  be  well- 
behaved0      The  jackknife  technique  (3.  8)  can  be  applied,   using  alternate 
values    S         for  the  sub-estimates,   although  it  is  not  known  if  the  leading 
term  in  the  bias  is  0(l/m). 

Analytical  results  for     s    ,    and  the  jackknifed  estimate    *s    i  ,   cannot 
be  obtained  and  sampling  results  are  given  in  Tables  6,   7  and  8  for    S 
having  a  unit  exponential  distribution  and    S    having  the  distribution 
F(s)  -  s/(l+s). 
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(iii)     Sampling  results 

Results  are  given  in  Tables  6,   7  and  8  from  an  extensive 

simulation  to  investigate  the  properties  of  quantile  estimates  based  on 

(7.  1)  and  (7.  2).      The  columns  in  these  tables  show  successively    m, 

the  sample  size;  m  ,   the  reduced  sample  size;    the  expected  values 

and  standard  deviations  of  the  jackknifed  estimator     s'  ,    based  on    (7.  1) 

a 

and    (7.  2)    and  the  expected  values  and  standard  deviations  of  the  straight- 
forward estimator     s    ,    based  on   (7.  1)    and  (7    2).      The  last  column 
in  the  Tables  gives  the  asymptotic  standard  deviations  from    (6.  2).     Note 
that    v  *  7  00.      To  indicate  the  precision  obtained  in  the  sampling  ex- 
periment the  quantities  in  parentheses  in  the  last  row  of  each  table  are 
estimates  (7  degrees  of  freedom)  of  the  standard  deviations  of  the  estimates. 
In  Table  6,   the  minimum  transformation  gives  estimates  of 

•  =  0.  001001,   the    0.  001     quantile     of  the  unit  exponential  distribu- 

5S1  7  00 

tion.      The  jackknifed  estimator     s"  t,  where    a%  «=  1  -  [1-0.  001]         ,  has 

converged  by  the  time    m    reaches  roughly  30,000.      There  is  still  a 

small  bias  in  the  unjackknifed  estimator     s    ,  ,   approximately  one  tenth 

of  the  standard  deviation  of  the    estimator  at    m  «=  28,  000.      There  is  a 

penalty  paid  in  terms  of  a  larger  standard  deviation  for  the  jackknifed 

estimator  (  **  20%)    and  the  standard  deviation  for    %  ,    is  larger  than 

a 

predicted  by  the  asymptotic  formula  (6.  2).      This  is  due  to  the  added 
variability  introduced  by  the  density  estimate  (7.  1)    and  the  finite 
sample  size    m. 
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TABLE  6 


Estimation,  of    0.  001     quantile  with  minimum  transformation 
and  stochastic  approximation,  with    (s    .)    and  without    (s    ,) 
jackknife.     a1  =  1  -[1  -0.  001  ]v,   v  =  700.       Unit  exponential 
distribution. 


F(s)  »  1  -  e 


-  s 


F(s)-1   -[1-F(s)]v;      sQ    001- 0.001001 


Asymptotic 

m 

i 

1 

St. 

dev.  (s  ,) 
a 

E<v> 

st.  dev.  (s*  ,) 

st.  dev. 

! 
j 

1 

1  0.  00107 

0. 

00095 

0.  00111 

0. 00065 

4200 

6  i 

0.  000587 

5600 

8 

0. 00106 

0. 

00081 

0.  00109 

0.  00058 

0.  000509 

7000 

10 

0.  00104 

0. 

00073 

0.  00108 

0. 00053 

0.  000455 

8400 

12 

0.  00103 

0. 

00067 

0.  00107 

0.  00050 

0.  000415 

9800 

14 

0.  00103 

0. 

00062 

0.  00106 

0.  00047 

0.  000385 

11200 

16 

0.  00102 

0. 

00058 

0.  00106 

0.  00045 

0.  000360 

12600 

18 

0.  00102 

0. 

00055 

0.  00105 

0.  00043 

0.  000339 

14000 

?0 

0. 00102 

0. 

00053 

0.  00105 

0.  00042 

0.  000322 

16800 

24 

0.  00101 

0. 

00049 

0.  00104 

0.  00040 

0. 000294 

19600 

28 

0.  001008 

0. 

000456 

0.  001034 

0.  000378 

0.  000272 

28000 

40 

0. 001003 

0. 

000400 

0.  001025 

0.  000344 

0.  000228 

42000 

60 

0. 000999 

0. 

000346 

0.  001016 

0.  000310 

0. 000186 

(0.  000001) 

(0. 

000001) 

(0.  000001) 

(0.  000002) 
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TABLE  7 

Estimation  of    0.  999     quantile  with  maximum  transforma- 
tion and  stochastic  approximation,   with    (IT  t)    and  without 
(s    t)    jackknife.      a'  =  0.  999vt  v  g  700.      Unft  exponential 
distribution. 


F(s) 


1  -  e 


-s 


V. 


F(S)  =  [F(s)]vJ 


S0.999  =  6-908 


m 

m! 

E(f  ,) 
a 

st.  dev.    (IT  t) 

E<v> 

st.    dev.    (f   t) 
a 

Asymptotic 
st.    dev. 

4200 

6 

6.946 

0.  97  0 

6.962 

0.  640 

0.  587 

5600 

8 

6.942 

0.  825 

6.957 

0.  569 

0.  508 

7000 

10 

6.931 

0.736 

6.949 

0.  522 

0.  455 

8400 

12 

6.929 

0.  670 

6.946 

0.  488 

0.  415 

9800 

14 

6.926 

0.  619 

6.942 

0.  461 

0.  384 

11200 

16 

6.923 

0.  581 

6.940 

0.  440 

0.  359 

12600 

18 

6.  920 

0.  548 

6.936 

0.  423 

0.  339 

14000 

20 

6.919 

0.  520 

6.  934 

0.  406 

0.  321 

16800 

24 

;  6.919 

0.  481 

6.  933 

0.  386 

0.  293 

19600 

28 

6.916 

0.  448 

6.929 

0.  365 

0.  27  2 

28000 

40 

6.  912 

0.  384 

6.  923 

0.  326 

0.  227 

42000 

60 

6.912 
(0.  001) 

0.  332 
(0.  001) 

6.920 
(0.  001) 

0.  294 
(0.  001) 

0.  186 
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TABLE  8 


Estimation  of    0.  999    quantile  with  maximum  transforma- 
tion and  stochastic  approximation,  with    (s    ,)    and  without 


(s    ,)    jackknife.     a'  =  0.  999v,   v  »  7  00. 

a  • 


a 


F(s)  =  s/(l+s);      F(s)=  [s/(l+s)]v;  sQ  ^  =  999 


m 

m» 

E<v> 

st.  dev.  (s  ,) 

E(s  ,) 

a 

st.  dev.  (s  .) 
a 

Asymptotic 
st.  dev. 

4200 

6 

1048 

2902, 

1233 

1375 

587 

5600 

8 

1196 

17  29 

1217 

1233 

508 

7000 

10 

1076 

2416 

1197 

1195 

455 

8400 

12 

1126 

1412 

1180 

1042 

415 

9800 

14 

1063 

1422 

1163 

872 

384 

1200 

16 

1076 

1078 

1148 

802 

359 

2600 

18 

1052 

1082 

1137 

738 

339 

4000 

20 

1060 

1123 

1127 

753 

321 

6800 

24 

1048 

87  2 

1112 

685 

293 

9600 

28 

1042 

828 

1103 

698 

27  2 

.8000 

40 

1025 

7  47 

1077 

624 

227 

t2000 

60 

1016 

551 

1055 

494 

186 

(    1) 

(  23) 

(    1) 

(  28) 
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In  Table  7  the  maximum  transformation  gives  estimates  of 
s  =  6.  908,   the    0.  999  quantile  of  the  unit  exponential  distribution. 

The  bias  is  smaller     for  the  jackknifed  estimator,   and  extrapolation  of 
figures  in  column  3  gives  convergence  in  the  mean  for  most  purposes 
by    m  *  50,000.     Note,   however,   that  the  standard  deviation/bias  ratio 
is  large    (  ~  100)    down  the  column.      Again,   there  is  an  inflation  in  the 
standard  deviation  of  the  jackknifed  estimator. 

The  bias  is  examined  more  closely  in  Figure  1  where  the  absolute 
value  of  the  (estimated)  bias  is  plotted  on  a  log  scale  against    m    for  both 
estimators.      The  curvature  indicates  that  higher  order  terms  than    1/m 
are  still  important  for  these  sample  sizes.      Except  for  the    m  «  42,000 
point,   the  bias  in  the  jackknifed  estimator  appears  to  be  falling  off  more 
rapidly  than  the  bias  for    "s"  ,    (here    a'  ■  0.  999         ).     No  formal  regression 
analyses  of  these  sampling  results  have  been  done. 

Note  that  for  both  estimators  the  absolute  value  of  the  bias  (except 
for  the  last  point)  is  less  than  the  absolute  value  of  the  bias  in  the  order 
statistic  estimator.      This  is  given  by  (3.  1)  and  plotted  as  a  sequence  of 
crosses  in  Figure  1. 

Bias  is  more  serious  for  the  extreme  case  of  the  distribution 
F(s)  »  s/(l+s),   as  shown  in  Table  8.      The  jackknifed  estimator  is  ad- 
vantageous here  where  the  standard  deviation/bias  ratio  is  smaller  than 
for  the  exponential  distribution  and  convergence  is  a  little  slower. 
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Figure  1.      Exact  bias  for  the  order  statistic  estimator     s"  t    of  the  0.  999  quantile 


a 


of  the  unit  exponential  distribution,   and  estimated  bias  for  estimators     s    ,     and 


a 


lisino  tVip  ma  virrmm   trsnsfnrmatinti     ( a     =   0    QQQ         v  =  700^     and    storhastir. 
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VIII.      CONCLUSIONS  AND  FURTHER  WORK 

The  conclusion  of  this  investigation  is  that  the  maximum  trans- 
formation and  the  stochastic  approximation  scheme  (7.  2)  yields  a 
quantile  estimator  for  extreme  quantiles  (e.  g.  ,   x      QQQ)    which  is  fast 
and  linear  in  sample  size    m;  requires  a  small,   fixed  amount  of  storage, 
and  without  using  external  information  provides  a  virtually  bias  free 
estimate  even  in  extreme  cases  (such  as  the     s/(l+s)    distribution)  for 
sample  sizes  of    m  =  50,  000.     In  most  cases  smaller  samples  will  be 
satisfactory. 

There  are  possibilities,   based  on  the  properties  of  extreme 
value  distributions,   of  improving  the  quantile  estimators  performance 
even  more. 

Note  too  that  the  estimator  can  be  used  to  advantage  for  smaller 
a    than    0.  999,   and  it  is  completely  suited  for  use  with  global  variance 
reduction  techniques  such  as  the  use  of  control  variables     (Gaver,    1969). 
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I.    SUPPLEMENTARY    NOTES 


3       ABSTRACT 


12.    SPONSORING    Ml  LI  TAR  Y    ACTIVITY 


The  large  random  access  memory  and  high  internal  speeds  of  present 
day  computers  can  be  used  to  increase  the  efficiency  of  large-scale  simu- 
lation experiments  by  estimating  simultaneously  several  quantiles  of  each 
of  several  statistics.   In  order  to  do  this  without  inordinately  increas- 
ing programming  complexity,  quantile  estimation  schemes  are  required  which 
are  simple  and  do  not  depend  on  special  features  of  the  distributions  of 
the  statistics  considered.   We  discuss  limitations,  when  the  probability 
level     is  very  high  or  very  low,  of  two  basic  methods  of  estimating 
quantiles.   One  method  is  the  direct  use  of  order  statistics;  the  other 
is  based  on  the  use  of  stochastic  approximation. 

Several  modifications  of  these  two  estimation  schemes  are  considered, 
In  particular  a  simple  and  computationally  efficient  transformation  of  the 
simulation  data  is  proposed  and  the  properties  (i.e.  bias  and  variance)  of 
quantile  estimates  based  on  this  scheme  are  discussed.  • 
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